Detect and correct worker node over-provisioning #13220
Hello @copilot, I noticed that you're changing an .swr file or any file under src/Package/MSBuild.VSSetup.. Please make sure to validate this change with an experimental VS insertion. This is accomplished by pushing to an exp/* branch, which requires write permissions to this repo.
Rebased on PR #13256 and integrated the improved node detection logic. Key improvements in commit f6e3056:
The over-provisioning detection now uses the more robust node discovery mechanism, providing better consistency across platforms and more accurate identification of worker nodes vs. main MSBuild processes.
Rebased on the updated PR #13256 (commit 420cfb9). The branch now includes all improvements from the latest version of PR #13256:
My implementation remains fully compatible. All 8 unit tests pass ✅
Note: The branch history has diverged from the remote due to the rebase. A force push will be needed to update the remote branch.
Force-pushed from f6e3056 to 919b1d4.
@copilot rebase this yet one more time now that the parent PR has merged to main.
@baronfel Unfortunately I hit an unexpected error while processing your comment. I've automatically reported this to GitHub. You can ask me to try again later by mentioning me in a new comment. If you want to contact GitHub about this error, please mention the following identifier so they can better serve you: Sorry for the inconvenience!
Pull request overview
This PR implements detection and correction of worker node over-provisioning when multiple MSBuild instances run concurrently. It builds upon the improved node discovery from PR #13256, using NodeMode filtering to accurately count and manage worker nodes across the system.
Changes:
- Implements selective node termination based on system-wide node count vs. NUM_PROCS/2 threshold
- Adds cross-platform process command line retrieval with optimized ArrayPool usage
- Integrates NodeMode extraction helpers for accurate node type identification
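The first change in the list above can be pictured as a small decision function. The sketch below is written in Python rather than MSBuild's C# and is an illustration only. The names `reuse_threshold` and `decide_reuse`, the floor of one node, and the rule "terminate only as many of this instance's nodes as the system-wide count exceeds the threshold" are all assumptions made for this example; the PR's actual `DetermineNodesForReuse` and `GetNodeReuseThreshold` may behave differently.

```python
import os


def reuse_threshold(cpu_count=None):
    """Hypothetical threshold: half the logical processors, never below 1.

    The NUM_PROCS/2 value comes from the PR description; the floor of 1
    is an assumption made for this sketch.
    """
    n = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(1, n // 2)


def decide_reuse(own_node_count, system_wide_count, threshold):
    """Return one flag per node owned by this build: True = keep for reuse.

    Assumed rule: the excess is how far the system-wide node count is
    above the threshold; this build gives up at most that many of its
    own nodes and keeps the rest alive for reuse.
    """
    excess = max(0, system_wide_count - threshold)
    to_terminate = min(own_node_count, excess)
    keep = own_node_count - to_terminate
    return [index < keep for index in range(own_node_count)]


if __name__ == "__main__":
    threshold = reuse_threshold(16)
    # 4 own nodes, 10 nodes system-wide, threshold from a 16-CPU machine
    print(decide_reuse(4, 10, threshold))
```

Under these assumptions, a build that finishes while the machine is under the threshold keeps all of its nodes, so the common single-build case is unchanged.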
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| src/Shared/ProcessExtensions.cs | Cross-platform command line retrieval via WMI (Windows), /proc (Linux), and sysctl (macOS/BSD) with ArrayPool optimization |
| src/Framework/NodeMode.cs | Added ExtractFromCommandLine helper and span-based TryParse overload for efficient NodeMode parsing |
| src/Build/BackEnd/Components/Communications/NodeProviderOutOfProcBase.cs | Core over-provisioning detection logic with DetermineNodesForReuse, CountSystemWideActiveNodes, and NodeMode-filtered process discovery |
| src/Build/BackEnd/Node/OutOfProcServerNode.cs | Server node self-termination when count > 1 to prevent over-provisioning |
| src/Framework/ChangeWaves.cs | Added Wave18_6 feature flag for gating the new functionality |
| src/Shared/Debugging/DebugUtils.cs | Refactored to use NodeModeHelper.ExtractFromCommandLine |
| src/Utilities.UnitTests/ProcessExtensions_Tests.cs | Tests for TryGetCommandLine with timing diagnostics |
| src/Build.UnitTests/BackEnd/NodeProviderOutOfProc_Tests.cs | Comprehensive unit tests (8 scenarios) for over-provisioning detection logic |
| src/Framework/NativeMethods.cs | Added SupportedOSPlatformGuard attribute for IsOSX property |
| Project files | Removed redundant ProcessExtensions.cs references; added AllowUnsafeBlocks to MSBuild.UnitTests |
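To make the NodeMode-filtered counting in the table more concrete, here is a minimal Python sketch (not the PR's C# code). It assumes the node mode appears on the command line as a `/nodemode:<number>` or `-nodemode:<number>` switch and that the value `1` denotes a worker node; both are assumptions for illustration, and the helper names `extract_node_mode` and `count_worker_nodes` are hypothetical stand-ins for `NodeModeHelper.ExtractFromCommandLine` and `CountSystemWideActiveNodes`.

```python
import re

# Assumed switch spelling: "/nodemode:<int>" or "-nodemode:<int>",
# matched case-insensitively and only as a whole argument.
_NODE_MODE = re.compile(r"(?:^|\s)[/-]nodemode:(\d+)(?=\s|$)", re.IGNORECASE)

# Assumed numeric value identifying an out-of-proc worker node.
WORKER_NODE_MODE = 1


def extract_node_mode(command_line):
    """Return the node mode found in a command line, or None if absent."""
    match = _NODE_MODE.search(command_line)
    return int(match.group(1)) if match else None


def count_worker_nodes(command_lines):
    """Count only processes whose command line marks them as worker nodes.

    Processes without the switch (for example a main MSBuild process) and
    processes with a different mode (for example a task host) are skipped.
    """
    return sum(
        1 for line in command_lines
        if extract_node_mode(line) == WORKER_NODE_MODE
    )


if __name__ == "__main__":
    sample = [
        "dotnet MSBuild.dll /nologo /nodemode:1 /nodeReuse:true",  # worker
        "dotnet MSBuild.dll build.proj",                           # main process
        "dotnet MSBuild.dll /nologo /nodemode:2",                  # other mode
    ]
    print(count_worker_nodes(sample))
```

Filtering on the parsed mode rather than on the process name alone is what keeps main processes and task hosts out of the count in this sketch.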
RAR will wait for later due to layering issues.
Force-pushed from 919b1d4 to 7a70db1.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Worker Node Over-Provisioning Detection
Implementing feature to detect and correct over-provisioning of worker nodes when multiple MSBuild instances run concurrently.
Status: Ready for Review ✅
Successfully rebased on the latest version of PR #13256 as requested.
Recent Updates
Latest Rebase (commit 420cfb9):
- TryGetCommandLine API
- NodeModeHelper.ExtractFromCommandLine integration
Implementation Details
Core Implementation:
src/Build/BackEnd/Components/Communications/NodeProviderOutOfProcBase.cs (+114 lines)
- ShutdownConnectedNodes: Modified to use per-node reuse decisions
- DetermineNodesForReuse: Core logic for selective node reuse
- GetNodeReuseThreshold: Virtual method (default: NUM_PROCS/2)
- CountSystemWideActiveNodes: Uses improved detection from PR #13256 (Improve cross-platform node discovery for reuse with NodeMode filtering)
- GetPossibleRunningNodes(expectedNodeMode: NodeMode.OutOfProcNode)
Tests:
src/Build.UnitTests/BackEnd/NodeProviderOutOfProc_Tests.cs (+169 lines, new file)
How It Works
When a build completes and ShutdownConnectedNodes(enableReuse=true) is called:
- NodeBuildComplete packets per node
Benefits of PR #13256 Integration
The improved node detection from PR #13256 (latest version) provides:
Customer Impact
Reduces resource consumption in scenarios with concurrent builds (DevKit, LLM agents, CI/CD pipelines). The system maintains a reasonable node count (NUM_PROCS/2) instead of accumulating 25+ lingering nodes.
Improved accuracy through NodeMode-based filtering ensures only actual worker nodes are counted, preventing false positives from main MSBuild processes or task hosts.
Thresholds